A memetic grammar inference algorithm for language learning

Authors

  • Dejan Hrncic
  • Marjan Mernik
  • Barrett R. Bryant
  • Faizan Javed
Abstract

An unsupervised incremental algorithm for grammar inference and its application to domain-specific language development are described. Grammatical inference is the process of learning a grammar from a set of positive and, optionally, negative sentences. Learning general context-free grammars is still considered a hard problem in machine learning and has not yet been completely solved. The main contribution of the paper is a newly developed memetic algorithm, which is a population-based evolutionary algorithm enhanced with local search and a generalization process. The learning process is incremental, since a new grammar is obtained from the current grammar and the false negative samples, that is, positive samples that the current grammar fails to parse. Despite being incremental, the learning process is not sensitive to the order of samples. All important parts of this algorithm are explained and discussed. Finally, a case study of a domain-specific language for rendering graphical objects is used to show the applicability of this approach. © 2011 Elsevier B.V. All rights reserved.
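
The notion of a false negative sample can be illustrated with a small sketch. It is not the authors' memetic algorithm: it only shows how a candidate grammar might be checked against positive and negative sentences, and how the positive sentences it fails to parse would be collected as input for the next incremental step. The CYK recognizer, the Chomsky-normal-form encoding, the toy balanced-parentheses grammar, and the names `parses`, `false_negatives`, and `fitness` are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical toy grammar in Chomsky normal form (non-empty balanced parentheses):
#   S -> S S | L R | L X,   X -> S R,   L -> '(',   R -> ')'
BINARY = {("S", "S"): {"S"}, ("L", "R"): {"S"}, ("L", "X"): {"S"}, ("S", "R"): {"X"}}
TERMINAL = {"(": {"L"}, ")": {"R"}}


def parses(sentence, binary=BINARY, terminal=TERMINAL, start="S"):
    """CYK recognizer: True if the grammar derives the sentence from `start`."""
    n = len(sentence)
    if n == 0:
        return False
    # table[i][j] holds the nonterminals that derive sentence[i..j] (inclusive).
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, token in enumerate(sentence):
        table[i][i] = set(terminal.get(token, ()))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):
                for left, right in product(table[i][k], table[k + 1][j]):
                    table[i][j] |= binary.get((left, right), set())
    return start in table[0][n - 1]


def false_negatives(positives, **grammar):
    """Positive samples that the given grammar does not parse."""
    return [s for s in positives if not parses(s, **grammar)]


def fitness(positives, negatives, **grammar):
    """Fraction of samples classified correctly (an assumed fitness measure)."""
    correct = sum(parses(s, **grammar) for s in positives)
    correct += sum(not parses(s, **grammar) for s in negatives)
    return correct / (len(positives) + len(negatives))


positives = ["()", "(())", "()()"]
negatives = [")(", "(()"]

# An under-general candidate that only knows S -> L R.
weak = {("L", "R"): {"S"}}
print(false_negatives(positives, binary=weak))
print(fitness(positives, negatives, binary=weak))
print(fitness(positives, negatives))
```

In an incremental setting of the kind the abstract describes, the sentences returned by `false_negatives` would be the ones handed to the next round of search, together with the current grammar.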

Similar resources

Embedding DSLs into GPLs: a Grammatical Inference Approach

Embedding of Domain-Specific Languages (DSLs) into General-Purpose Languages (GPLs) is often used to express domain-specific problems using the domain’s natural syntax inside GPL programs. It speeds up the development process, programs are more self-explanatory and repeating tasks are easier to handle. End-users or domain experts know what the desired language syntax would look like, but do not...

MMDT: Multi-Objective Memetic Rule Learning from Decision Tree

In this article, a Multi-Objective Memetic Algorithm (MA) for rule learning is proposed. Prediction accuracy and interpretability are two measures that conflict with each other. In this approach, we consider both the accuracy and the interpretability of rule sets. Additionally, individual classifiers face other problems such as huge size, high dimensionality and imbalanced class distributions in data sets. This...

Learning Stochastic Context-Free Grammars from Corpora Using a Genetic Algorithm

A genetic algorithm for inferring stochastic context-free grammars from finite language samples is described. Solutions to the inference problem are evolved by optimizing the parameters of a covering grammar for a given language sample. We describe a number of experiments in learning grammars for a range of formal languages. The results of these experiments are encouraging and compare very favour...

Context–free grammar induction using evolutionary methods

Research into the feasibility of building a self-learning natural language parser based on a context-free grammar (CFG) is presented. The paper investigates the use of evolutionary methods (a genetic algorithm, genetic programming and learning classifier systems) for inferring a CFG-based parser. The experiments were conducted on a real set of natural language sentences. The gained results confi...

Grammar Induction and Genetic Algorithms: An Overview

Grammar Induction (also known as Grammar Inference or Language Learning) is the process of learning a grammar from training data. This paper discusses the various approaches for learning a context-free grammar (CFG) from a corpus of strings and presents the approach of informant learning in the form of results for two standard grammar problems, namely Balanced Parenthesis Grammar and Palindrome ...

Journal:
  • Appl. Soft Comput.

Volume 12

Publication date: 2012